Neural Network Compression Framework for Fast Model Inference
نویسندگان
چکیده
We present a new PyTorch-based framework for neural network compression with fine-tuning named Neural Network Compression Framework (NNCF) (https://github.com/openvinotoolkit/nncf) . It leverages recent advances of various methods and implements some them, namely quantization, sparsity, filter pruning binarization. These allow producing more hardware-friendly models that can be efficiently run on general-purpose hardware computation units (CPU, GPU) or specialized deep learning accelerators. show the implemented their combinations successfully applied to wide range architectures tasks accelerate inference while preserving original model’s accuracy. The used in conjunction supplied training samples as standalone package seamlessly integrated into existing code minimal adaptations.
منابع مشابه
GAZELLE: A Low Latency Framework for Secure Neural Network Inference
The growing popularity of cloud-based machine learning raises a natural question about the privacy guarantees that can be provided in such a setting. Our work tackles this problem in the context where a client wishes to classify private images using a convolutional neural network (CNN) trained by a server. Our goal is to build efficient protocols whereby the client can acquire the classificatio...
متن کاملA New Fast Neural Network Model
In this paper, a new model for testing patterns with neural networks is presented. The idea is to accelerate the operation of testing patterns by using neural networks. This is done by applying cross correlation between the input patterns and the input weights of neural networks in the frequency domain rather than time domain. Furthermore, such model is very useful for understanding the interna...
متن کاملASCOC: A Locally Recurrent Neural Network Model for Grammatical Inference
Grammatical Inference Dit-Yan Yeung, Kei-Wai Yeung Department of Computer Science Hong Kong University of Science and Technology Abstract| Many real-world problems require the handling of temporally related events. Recently, some recurrent neural network models were proposed to solve various temporal sequence processing problems. One of the most popular models is Elman's simple recurrent networ...
متن کاملCALYPSO: A Neural Network Model for Natural Language Inference
The ability to infer meaning from text has long been regarded as one of the “benchmarks” of the quest to artificially approximate human intelligence. The field of Natural Language Inference explores this task by explicitly modeling inference relationships in natural language. In this work, we present the CALYPSO model, which builds upon Chen et al. ’16’s EBIM model by enhancing the matching lay...
متن کاملArtificial Neural Network Model for Predicting Insurance Insolvency
In addition to its primary role of providing financial protection for other industries the insurance industry also serves as a medium for fund mobilization. In spite of the harsh economic environment in Nigeria, the insurance industry has been crucial to the consummation of business plans and wealth creation. However, the continued downturn experienced by many countries, in the last decade, se...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Lecture notes in networks and systems
سال: 2021
ISSN: ['2367-3370', '2367-3389']
DOI: https://doi.org/10.1007/978-3-030-80129-8_17